e-print published at http://arxiv.org/abs/0911.4397vl November 2009 



How slow is slow? 
SFA detects signals that are slower than the driving force 

Wolfgang Konen, Patrick Koch 

Institute for Informatics, Cologne University of Applied Sciences 
Steinmullerallee 1, D-51643 Gummersbach, Germany 
http : / / www . gm . f h-koeln . de/ ~konen 
wolfgang . konenOf h-koeln . de 

Abstract 

Slow feature analysis (SFA) is a method for extracting slowly varying driving forces from quickly 
varying nonstationary time series. We show here that it is possible for SFA to detect a component which 
is even slower than the driving force itself (e.g. the envelope of a modulated sine wave). It is shown 
that it depends on circumstances like the embedding dimension, the time series predictability, or the 
base frequency, whether the driving force itself or a slower subcomponent is detected. We observe a 
phase transition from one regime to the other and it is the purpose of this work to quantify the influence 
of various parameters on this phase transition. We conclude that what is percieved as slow by SFA 
varies and that a more or less fast switching from one regime to the other occurs, perhaps showing some 
similarity to human perception. 

1 Introduction 

The analysis of nonstationary time series plays an important role in the data understanding of various 
phenomena such as temperature drift in experimental setup, global warming in climate data or varying 
heart rate in cardiology. Such nonstationarities can be modeled by underlying parameters, referred to as 
driving forces, that change the dynamics of the system smoothly on a slow time scale or abruptly but rarely 
e.g. if the dynamics switches between different discrete states. (Wis03). 

Often, e.g. in EEG-analysis or in monitoring of complex chemical or electrical power plants, one is 
particularly interested in revealing the driving forces themselves from the raw observed time series since they 
show interesting aspects of the underlying dynamics, for example the switching between different dynamic 
regimes. 

Several methods for detecting and visualizing driving forces have been developed: based on recurrence 
plots (EKR87; Cas97), error dissimilarity matrix (Sch99), feedforward ANNs with extra input unit (VGNC01) 
or, as Wiskott recently proposed, by Slow Feature Analysis (SFA) (Wis03). 

What is "slow" in the driving forces compared to the raw observed time series? Often it might be the 
case that driving forces contain components on different time scales and it is crucial to understand which 
time scale will be selected by the driving force algorithm. As an example we will consider driving forces made 
up of two overlayed frequencies /i < / 2 . Will the driving force detection algorithm detect the slower one of 
the frequencies, / l5 thus being more slow, or the combined driving force made up of fi and /2, thus being 
more accurate? With this paper we try to deepen our understanding which paramaters influence whether 
the first or the second choice is taken. 

We base our analysis on SFA (WS02; Wis03) as driving force detection algrithm since it constitutes a 
versatile, robust and fast algorithm. 
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2 Slow Feature Analysis 



Slow feature analysis (SFA) has been originally developed in context of an abstract model of unsupervised 
learning of invariances in the visual system of vertebrates (Wis98) and is described in detail in (WS02; Wis03). 



2.1 The SFA algorithm 

We briefly review here the approach described in (Wis03). The general objective of SFA is to extract 
slowly varying features from a quickly varying multidimensional signal. For a scalar output signal and an 
./V-dimensional input signal x — x(t) where t indicates time and x — [xi, ...,xn] t is a vector, the question 
can be formalized as follows: Find the input-output function g(x) that generates a scalar output signal 

y(t) := g(x(t)) (1) 

with its temporal variation as slowly as possible, measured by the variance of the time derivative: 

minimize A(y) = (y 2 ) (2) 

with (•) indicating the temporal mean. To avoid the trivial constant solution the output signal has to meet 
the following constraints: 

(y) = (zero mean) , (3) 
(y 2 ) — 1 (unit variance) . (4) 

This is an optimization problem of variational calculus and as such difficult to solve. But if we constrain 
the input-output function to be a linear combination of some fixed and possibly nonlinear basis functions, 
the problem becomes tractable with the mathematical details given in (Wis03). (A typical choice for the 
nonlinear basis functions are monomials of degree 2, 

h(x) = [xt, x 2 , x N , x\, x%x 2 , x 2 N ] T , 

but other choices, e. g. monomials of higher degree or radial basis functions could be used as well.) Basically, 
SFA consists of the following four steps: (i) expand the input signal with some set of fixed possibly nonlinear 
functions; (ii) sphere the expanded signal to obtain components with zero mean and unit covariance matrix; 
(iii) compute the time derivative of the sphered expanded signal and determine the normalized eigenvector 
of its covariance matrix with the smallest eigenvalue; (iv) project the sphered expanded signal onto this 
eigenvector to obtain the output signal, which we will denote by y\. Sometimes we will be also interested in 
the second-smallest eigenvalue and the corresponding projected output signal, which we will denote by y 2 . 
As usual for eigenvectors, higher components are orthogonal to lower ones, i. e. signal yi satisfies the i — 1 
additional conditions (ViVj) = Vj = 1, ...,£— 1. 

In the rest of this paper we will work with time series x(l), x(2), . . . , x(t), t 6 N, instead of continuous 
signals x(t), t <E R, but the transfer of the algorithm described above to time series is straight forward. The 
time derivative is simply computed as the difference between successive data points assuming a constant 
sampling spacing At = 1. 



2.2 Slowness indicator rj 

It is useful to have a certain measure for the slowness of a signal. In principle A(y) is such an indicator, but 
Wiskott and Sejnowski (WS02) propose a more intuitive measure r\ defined by 



v(y) = f (5) 
^ V((y-(y)) 2 ) 

where (•) indicates the temporal average over [to, to + T]. The measurement interval T should be at least 
as long as one or two periods of the slowest signal component It is easy to show (WS02) that for a pure 
sine wave y(t) = sm{n2irt/T) the 77- value counts the number of oscillations in the observation interval, i.e. 
rj{y) — n. The denominator in Eq. (5) ensures that each linear transform ay(t) + b has the same 77 as y(t). 
(This normalization is not necessary for the training output signals of SFA, which are already by construction 
normalized to zero mean and unit variance, but it allows to apply the measure 77 to unnormalizcd time series 
like driving forces as well. Also note that SFA output signals derived from test data are only approximately 
normalized, so that the normalization is useful to make rj independent of an accidental scaling factor.) Low 
rj- values indicate slow signals, high rj- values fast signals. 
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3 Experiments 



In the following wc present examples with time series w(t) derived from a logistic map to illustrate the 
properties of SFA. The underlying driving force will always be denoted by 7 and may vary between —1 and 1 
smoothly and considerably slower (as defined by the variance of its time derivative (2)) than the time series 
w(t). The approach follows closely the work of Wiskott (Wis03), but with more systematic variations in the 
driving force. 

We consider here a driving force which is made up of two frequency components 

7 ( t) = -(sm(0.0005i//i) + sin(0.0047z^)) e [-1,1] (6) 
2 v „ v ' 

=T/s{t) =7pW 

where the first component 75 is roughly ten times slower than jp. The question is whether SFA as the 
driving force detector will detect solely the slower component 75 of the driving force (in an attempt to 
minimize 77) or the full driving force 7 (in an attempt to extract the underlying system dynamics as accurate 
as possible) . A second question is whether a phase transition between both choices might occur as we vary 
the base frequency Vf. 

In order to inspect visually the agreement between a slow SFA-signal and the driving force 7 we must 
bring the SFA-signal into alignment with 7 (the scale and offset of slow signals y(t) extracted by SFA are 
arbitrarily fixed by the constraints and the sign is random). Therefore we define an 7 -aligned signal 

A(y(t); 7 )=ay(t)+b (7) 

where the constants a and b are chosen in such a way that 

((ay(i)+6- 7 (i)) 2 ) iMin. (8) 

3.1 Logistic map in chaotic regime 

We consider a time series derived from a logistic map 

w(t + 1) = (3.9 + 0.1 7 (t))w(t)(l - w(t)) , (9) 

which maps the interval [0, 1] onto itself and has the shape of an upside-down parabola crossing the abscissa 
at and 1. Parameter r(t) = 3.9 + 0.17(f) governs the height of the parabola. Since the logistic map is in 
its chaotic regime for r > 3.57 we have with r(t) € [3.8,4.0] a highly chaotic map with no visible structre 
(Figure 1, top). 

Taking the time series w(t) directly as an input signal would not give SFA enough information to estimate 
the driving force, because SFA considers only data (and its derivative) from one time point at a time. Thus 
it is necessary to generate an embedding- vector time series as an input. Here embedding vectors at time 
point t with delay r and dimension to arc defined by 

x (t) := [w{t-r{m-l)/2), - r((m - l)/2 - 1)), ...,w{t + r(m - 1)/2)} T , (10) 

for scalar w(t) and odd m. The definition can be easily extended to even to, which requires an extra shift 
of the indices by t/2 or its next lower integer to center the used data points at t. Centering the embedding 
vectors results in an optimal temporal alignment between estimated and true driving force. 

The following simulations are based on 6000 data points each and were done with Matlab 7.0.1 (Release 
14) using the SFA toolkit sfa-tk (Ber03). 

Figure 1 shows the time series, the estimated driving force (from SFA with m — 19, r = 1 and second 
order monomials), and the true driving force. The estimated driving force is at Vf = 20 in good alignment 
with the true driving force. If we use a four times higher base frequency vj = 80 in Figure 2 we see that 
the estimated driving force is now in nearly perfect alignment with the slower component 75 (t). This is 
remarkable since the slower component is not directly visible in the driving force, only indirectly as envelope 
around the green curve (see detail plot in Figure 2, bottom). 
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Figure 1: Time series w(t) (top) derived from the logistic map with the driving force for Vf — 20, the slowest 
SFA signal yi(t) (middle, blue dots), a detail plot for < t < 500 of the true slowly varying driving force 
"f(t) (bottom, solid green line), its slower component 7s(£) (bottom, dashed red line) and the alignment (see 
Eq. (7), 8) of y\(t) to both components (bottom, blue dots). The "/-aligned output A(i/i(t);j) ( = estimated 
driving force J est (t)) is here in good agreement with the green component. The absolute correlation between 
yi(t) and the true driving force is \C(yi(t),j(t))\ = 0.9961. SFA was done with embedding dimension m = 19. 
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3.2 Logistic map in predictable regime 



Fig. 3 shows quite clearly that there is a phase transition occuring around Vf = 40. To localize the phase 
transition and to study its dependence on other parameters of the system we first generalize the logistic map 
Eq. (9) by a parameter q £ [0.1, 3.9] allowing to control the presence or absence of chaotic motion: 

w (t + 1) = (4.0 -q + 0.1 7 (t))w(i)(l - w(t)) , (11) 

which reduces to Eq. (9) for q — 0.1. Parameter q controls the predictabilty of the logistic map: For q < 0.33 
the map is fully in its chaotic regime, for 0.33 < q < 0.53 we have a mixture of chaotic and predictable 
periods and for 0.53 < q < 3.9 it is long-term predictable. In Fig. 4 we vary the base frequency Vf £ [20, 80] 
and we define the phase transition frequency v{P.T.) as the lowest Uf with \C(yi(t),-f(t))\ < \C(yi(t),js(t))\, 
where C(-, •) denotes the usual correlation. Fig. 4, right, shows that for small q = 0.1 (fully chaotic w) a a 
phase transition occurs at Vf — 36 while for larger q — 0.4 (mix of chaotic and non-chaotic periods in w) the 
phase transition happens earlier at Vf = 20 (Fig. 4, left). 
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Figure 2: Same as Fig. 1 but with base frequency uj — 80. The estimated driving force (bottom, blue dots) is 
now in good agreement with the slower red component 7s , (t). (We see two blue dotted curves since we align 
the slowest SFA signal once with j(t) and once with js {t) ■) The absolute correlation between y\{t) and the 
slow component is \C(yi(t), 7s(t))| = 0.9974. 



3.3 The phase transition as a function of q and m 

How does the phase transition frequency v(P.T.) vary as a function of the predictability q and the embedding 
dimension m of the SFA-input signal? Both parameters are varied systematically over a broad range and 
the results are depicted in Fig. 5. First of all it is interesting to note that the SFA algorithm, being basically 
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Figure 3: The bottom panel of Fig. 1 for three base frequencies Vf = 20,40,60 clearly shows the phase 
transition from the complete driving force "f(t) (green solid line) to its slower subcomponent 75 (t) (red dashed 
line). (We see two blue dotted curves since we align the slowest SFA signal once with j(t) and once with 
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Figure 4: Absolute correlation \C(yi(t),j(t))\ and \C(yi(i), 7s(i))| of the slowest SFA-signal yi with the 
driving force 7 and with its slow component 75 (dash-dotted and dashed lines). For sufficient large Vf the 
correlation with the slow component 7s is stronger than the correlation with 7. Left: q = 0.1 (logistic map 
in its chaotic regime). The quotient tisfaMi = vti/^/vd) (solid red line), being initially 1 or higher, drops 
largely for v$ > 36 and eventually approaches a value of only 0.1, which corresponds to r]('ls)/' n (l)- SFA was 
carried out with monomials of degree 2, embedding dimension m = 19 and delay r = 1. Right: The same 
for q = 0.4 (logistic map, only partly chaotic) gives the same qualitative results, only the phase transition is 
shifted to lower frequency Vf = 17. 

parameter-free, works very successfully over this broadly varying input material, which makes SFA a robust 
and versatile algorithm. Tab. 1 shows the slowness indicator rj for some m and q. 

Before we discuss the results further in Sec. 4, a second remark is in order concerning the SFA im- 
plementation sfa-tk (Ber03): While it worked well for small embedding dimensions m, larger m led quite 
inevitably to wrong "slow" signals j/i which were neither slow nor did they respect the unit variance condition 
(y 2 ) = 1. In an accompanying second paper (Kon09) we trace this behaviour back to numeric instabilities of 
the implementation and present a slightly modified implementation (closer along the lines of (WS02)) and 
based on SVD which successfully avoids these numeric instabilities. This modified implementation is used 
throughout the experiments in this paper. 

4 Discussion 

It is important for driving force analysis with SFA to understand the mechanisms by which the slowest signal 
is selected. If the driving force contains two components of different frequencies two interesting things might 
happen: If the base frequency Vf is large enough then SFA will return the slower component as slowest signal. 
This is quite remarkable, since SFA detects a signal with a smaller 77 than the driving force itself. Recall 
that this slower component is not directly visible in the driving force, only indirectly as the envelope. But 
after all, it is also quite understandable: If we view the dynamical system as a two-stage process where the 
slow component 75 is considered as a modulating force acting on the other (faster) component 7f with the 
output of this stage acting on the dynamical system, then in such a system description the slower component 
7s becomes directly visible. 

Surprisingly, if we lower the base frequency Vf, we reach the point where the slow component comes "out 
of sight" and the slowest signal returned by SFA is well-aligned with the driving force itself (slow plus fast 
component) . Why is the slow component alone no longer detected by SFA? We hypothesize that two reasons 
are responsible for this: 

• The embedding dimension m might be too low so that the slow component is perceived within the 
embedding vector as nearly constant. Then increasing m should lower the phase transition frequency 
and finally detect the slow component. 
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Figure 5: Left: Phase transition frequency v(P.T.) as a function of q and m. In the area above each phase 
transition curve the detection of the slower component 75 of the driving force is stronger. Right: The absolute 
correlation values at fixed vt — 40 show a phase transition towards the slower component as we increase the 
embedding dimension m (top) or the value q ('predictability' of the logistic map, bottom). 



• Another reason might be the chaotic nature of the logistic map. The slow component might vary on a 
time scale which is slower than the time scale of 'forgetting' in the chaotic logistic map and thus this 
component alone is not detectable by SFA. If this is true then moving to a better predictable region 
of the logistic map (increasing q) should make the slow component again detectable. 

Both hypotheses are well-supported by the result shown in Fig. 5. On the left-hand side we see the location 
of the phase transition. For most input signals which are a function of q and Vf there seems to be a 
sufficient large m so that the slow component becomes detectable. For q — 0.7 this occurs already at 
very low frequencies. The curve for q = 0.6 (not shown) is for m > 10 very similar to q — 0.7, which is 
well-understandable if we recall that all q > 0.53 make the time series long-term predictable, thus even a 
very slow subcomponent becomes detectable. On the right-hand side of Fig. 5 we see that both methods, 
increasing m or increasing q finally lead to a reliable detection of the slow subcomponent as it is claimed by 
our hypothesis. 

Other slow signals Consider the case where the slowest signal j/i aligns well with the slow component 75. 
One might expect that the second slow signal yi aligns well with the fast component 7^. But it turns out, 
that usually the next slow signals y2,Va, Hi, ■ ■ ■ have a similar slowness as 7s although they are orthogonal 
to y±. Only at some later index, e.g. j/5 or j/10, corresponding to the 5th or 10th smallest eigenvalue, the 
slowness suddenly jumps to higher values and reveals a signal well-aligned with jp. Note that if the slowest 
signal yi has a high correlation with 75, no other signal will align perfectly with the complete driving force 
because this would be not orthogonal to 75 and is thus avoided. 

Robustness of SFA SFA as tested in this paper works robustly over a large range of parameters i/f, m, q. 
It is however necessary to deal carefully with zero eigenvalues which occur frequently when the embedding 
dimension m is large and/or the noise is low. If zero eigenvalues are not handled it is likely to see numerical 
instabilities. A numerically robust way to handle them is based on SVD and is described in (Kon09). 

Accuracy of the Estimated Driving Force The driving force is estimated with high accuracy, although 
the estimation is undetermined up to any invertible transformation (Wis03). We found that his is true even 
for large embedding dimension, e. g. m — 51, in contrast to the hypothesis in (Wis03) that only small 
embedding dimensions would avoid more complicated invertible transformations. 

Noise Sensitivity The results described in this paper were obtained with noise-free data. We tested in 
some simulations the effects of adding Gauusian noise to the data before embedding. For m = 19, vt = 56 
and q = 0.4 the effect of adding 1%, 2% and 5% noise brought the correlation between slow component 75 
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Table 1: Slowness indicator rj for various m (embedding dimension) and q (predictability). Large rj indicate 
fast, small 7/ indicate slow signals. The true driving force 7 has rj = 127 while its slow component 73 has 
T) = 19.1. 
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and slowest SFA-signal yi from |C| = 0.999 down to 0.85, 0.75 and 0.60, resp. Thus for small noise levels 
< 1% the main effects persist, but in contrast to (Wis03) the noise sensitivity is somewhat higher, which 
means that 5% noise destroy most of the correlation with the slow component. This might be due to the 
more complex nature of the driving force build up from multiple components. On the other hand we found 
that larger embedding dimensions, e. g. m = 51 stabilize the system and bring up the correlation again to 
0.963, 0.893 and 0.804, resp., but further experiments are needed to investigate this systematically. 

High-Dimensional Input Data As the preceeding paragraph has shown higher dimensional input data 
usually improve the SFA performance. However, since the number of monomials grows quadratically with 
the embedding dimension, the requirements in computer memory and computing time quickly increases. It 
is therefore interesting to investigate whether similar results as with high m can be obtained by hierarchical 
approaches where first smaller parts of the embedding vector are analyzed and then combined by a final 
SFA, as has been demonstrated in (WS02). 

Connection to human perception Since SFA has been originally developed as a model for neural 
information processing (Wis98), it might be natural to ask, whether the observed switch between components 
and its phase transition has any parallel in human perception or human motion coordination. Several 
phenomena with switching effects are discussed in the literature: 

• The well-known backward spinning-wheel illusion (PPA96) occurs frequently in movies or under stro- 
boscope lighting conditions and it shows the transition from a fast forward rotation percept to a slow 
backward rotation percept. This effect is usually explained by the snapshot-like presentation of the 
percept which has ambiguous motion interpretations. Somewhat less known is that a similar, although 
harder to perceive effect can occur under plain sunlight and direct view with the eye (KHE04; PPA96). 
No snapshot-like explanation is possible here, the percept is continous having a greater resemblance 
to the smoothly varying driving force of our SFA experiments. A possible explanation of the sunlight 
spinning-wheel illusion is according to (KHE04) that rivalry between different motion detectors in the 
brain occurs. 

• Another well-known phase transition occurs in bimanual motion coordination when performing certain 
movements with the index fingers of both hands (Kcl81) for which a theoretical model exists, the 
so-called Haken-Kelso-Bunz model (HKB85). The Haken-Kelso-Bunz model also describes a phase 
transition and certain hysteresis effects. 

SFA has shown similar capabilities in the sense that the same setup can learn to synchronize with different 
components of a driving force, depending on the experiment conditions. It remains however to be studied, 
whether one trained SFA system can (without further learning) switch between different components when 
applied to signals with smoothly varying base frequency and whether a hysteresis effect can be observed. 

5 Conclusion 

In this paper we have investigated the notion of slowness in SFA. It has been verified that SFA can reliably 
detect either slow driving forces or their subcomponents over a broad range of parameters in nonstationary 
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time series, even in the presence of chaotic motion. 

However it has also been seen that what is perceived as slow can vary for driving forces made up of 
components on different time scales: Depending on the embedding dimensions and the predictability of 
the underlying dynamical system we observe phase transitions where the slowest SFA-signal moves from 
alignment to a slow subcomponent to alignment with the (faster varying) complete driving force. Notably 
when alignment to the slow subcomponent occurs, SFA is capable of detecting slow signals with an 77-indicator 
considerably lower than the 77-value of the true driving force. 

There are still a number of open questions. Does hierarchical SFA, which achieves larger embedding 
dimensions with a smaller computing budget, allow for the detection of very slow components which are 
'out-of-reach' for plain SFA? Can an extended SFA system model more closely certain switching effects 
known from human perception (as for example the backward spinning-wheel illusion (PPA96))? Finally, it 
is necessary to apply SFA to real world data and to see whether the results reported in this study have some 
similar parallel there. 

In real world data it will often not be possible to vary the base frequency or the degree of nonlinearity in 
the observed dynamical system systematically. Therefore, one advice from the present study should be to 
vary the embedding dimension over a broad range in order to detect possible slow signals which otherwise 
might be hidden. In any case, SFA has shown to be robustly working on a broad range of input data and 
it is able to reveal subtle components in the driving forces, thus making it a versatile tool for driving force 
detection. 
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